REP-6088 Fix display of verification summary #119

Merged
FGasper merged 3 commits into mongodb-labs:main from felipe_fix_summary
Jun 23, 2025

FGasper
Collaborator

@FGasper FGasper commented Jun 12, 2025

PR #117 introduced some bugs in the verification summary logging. This addresses the following:

  • Mismatches were previously shown in an indeterminate order. Now they’re consistently sorted by the mismatched documents’ _id.
  • Documents with missing fields were being logged as entirely missing. That logic is corrected here.
  • The logic that builds the table of missing/changed documents previously iterated through the _ids persisted in the task rather than the actual missing/changed documents. That was appropriate when the list stored mismatches, but it is no longer correct because the list now always stores the documents to check in the task. Thus, if a recheck task contained thousands of document IDs but only a handful of missing documents, all of that task’s document IDs would be logged as missing. This was an oversight in PR #117 (“REP-6088 Tolerate high numbers of mismatches”), which should have updated the logic that builds this table when it migrated the logic for the mismatched-documents table. This changeset makes the necessary update.
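
The third fix can be illustrated with a minimal, self-contained sketch. The types and names below (`Mismatch`, `MismatchKind`, `missingOrChangedRows`) are hypothetical and not taken from the verifier's code. The sketch only assumes that a recheck task carries a list of document IDs to check and that mismatches are recorded separately. It builds the table from the recorded mismatches rather than from the task's ID list, and it treats a document with a missing field as changed, not as missing.

```go
package main

import "fmt"

// MismatchKind is a hypothetical classification of a recorded mismatch.
type MismatchKind int

const (
	DocMissing     MismatchKind = iota // the whole document is absent on one side
	FieldMissing                       // the document exists but lacks a field
	ContentDiffers                     // the document exists but a value differs
)

// Mismatch is a hypothetical record of one verification failure.
type Mismatch struct {
	ID    string
	Kind  MismatchKind
	Field string // empty when the whole document is missing
}

// missingOrChangedRows walks the recorded mismatches (not the task's ID list)
// and returns one ID per mismatch, split into missing and changed documents.
// Only DocMissing counts as missing; every other kind counts as changed.
func missingOrChangedRows(mismatches []Mismatch) (missing, changed []string) {
	for _, m := range mismatches {
		if m.Kind == DocMissing {
			missing = append(missing, m.ID)
		} else {
			changed = append(changed, m.ID)
		}
	}
	return missing, changed
}

func main() {
	// The task was asked to check five documents; only two actually failed.
	taskIDs := []string{"a", "b", "c", "d", "e"}
	mismatches := []Mismatch{
		{ID: "b", Kind: DocMissing},
		{ID: "d", Kind: FieldMissing, Field: "name"},
	}

	missing, changed := missingOrChangedRows(mismatches)
	fmt.Printf("checked=%d missing=%v changed=%v\n", len(taskIDs), missing, changed)
}
```

Iterating over `taskIDs` instead would report all five documents, which is the behavior this PR describes as the bug.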

Additional, small changes here, which I’ll merge in separate commits:

  • build.sh now logs the git commit & time that it’s building into the binary.
  • The compareOneDocument() method is moved to compare.go, which is more sensible since that’s where it’s called.
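
One common way for a Go build script to record the commit and build time in a binary is the linker's `-X` flag, which sets package-level string variables at link time. This PR page does not show how `build.sh` does it, so the sketch below is an assumption rather than the project's actual mechanism; the names `gitCommit`, `buildTime`, and `buildInfo` are hypothetical.

```go
package main

import "fmt"

// Hypothetical variables, overridden at link time, for example:
//
//	go build -ldflags "-X main.gitCommit=$(git rev-parse HEAD) -X main.buildTime=$(date -u +%Y-%m-%dT%H:%M:%SZ)"
//
// They keep their defaults when the binary is built without those flags.
var (
	gitCommit = "unknown"
	buildTime = "unknown"
)

// buildInfo formats the embedded build metadata for logging.
func buildInfo() string {
	return fmt.Sprintf("commit %s, built at %s", gitCommit, buildTime)
}

func main() {
	fmt.Println(buildInfo())
}
```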

@FGasper FGasper changed the title Fix display of verification summary REP-6088 Fix display of verification summary Jun 14, 2025
@FGasper FGasper requested review from tdq45gj and khodakovski June 16, 2025 12:29
@FGasper FGasper marked this pull request as ready for review June 20, 2025 04:44
Collaborator

@tdq45gj tdq45gj left a comment

LGTM. I left one question about performance, but it's not blocking.

@@ -49,6 +50,11 @@ func getMismatchesForTasks(
		bson.D{
			{"task", bson.D{{"$in", taskIDs}}},
		},
		options.Find().SetSort(
			bson.D{
				{"detail.id", 1},
Collaborator

Is this only useful for logging? I wonder if it's going to make the query slower by sorting here.

Collaborator Author

It should just sort the results it already has, so I don’t expect a performance problem.

The status quo is that every presentation of the table shows different failures, which doesn’t seem very user-friendly.
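
The effect on the log output can be illustrated without a database. The sketch below is a client-side analogue only: the PR sorts in the query itself via `SetSort`, and real `_id` values follow BSON comparison order rather than the plain string order used here. The `mismatchRow` type and `sortedByID` helper are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// mismatchRow is a hypothetical, simplified row of the summary table.
type mismatchRow struct {
	ID     string
	Detail string
}

// sortedByID returns a copy of rows ordered by ID, leaving the input untouched.
func sortedByID(rows []mismatchRow) []mismatchRow {
	out := append([]mismatchRow(nil), rows...)
	sort.SliceStable(out, func(i, j int) bool { return out[i].ID < out[j].ID })
	return out
}

func main() {
	// The same mismatches, returned in different orders by two hypothetical runs.
	runA := []mismatchRow{{"c", "missing"}, {"a", "changed"}, {"b", "missing"}}
	runB := []mismatchRow{{"b", "missing"}, {"c", "missing"}, {"a", "changed"}}

	// After sorting, both runs present the rows in the same order.
	fmt.Println(sortedByID(runA))
	fmt.Println(sortedByID(runB))
}
```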

Collaborator

@khodakovski khodakovski left a comment

LGTM! Thank you.

FGasper added 3 commits June 23, 2025 13:33
- Mismatches were previously shown in an indeterminate order. Now they’re
  consistently sorted by the mismatched documents’ `_id`.
- Documents with missing fields were being logged as entirely missing.
  That logic is corrected here.
- The logic to create the table of missing/changed documents previously
  iterated through the _ids persisted in the task rather than the actual
  missing/changed documents. This was appropriate when that list stored
  mismatches but is no longer correct since the list now always stores
  the list of documents to check in the task. Thus, if there were only a
  handful of missing documents in a recheck task that contained thousands
  of document IDs, all of that task’s document IDs would be logged as
  missing. This was an oversight from PR mongodb-labs#117, which should have updated
  the logic to build that table as it migrated that for the
  mismatched-documents table. This changeset does the necessary update.
@FGasper FGasper force-pushed the felipe_fix_summary branch from 6c9de7a to a391a17 Compare June 23, 2025 17:37
@FGasper FGasper merged commit e3b46be into mongodb-labs:main Jun 23, 2025
100 checks passed
@FGasper FGasper deleted the felipe_fix_summary branch June 23, 2025 17:45
Labels
None yet
Projects
None yet
3 participants